%===================================================================================
\section{C example problems}\label{s:ex_c}
%===================================================================================

\subsection{A serial dense example: kinFerTron\_dns}\label{ss:kinFerTron_dns}

As an initial illustration of the use of the {\kinsol} package for the
solution of nonlinear systems, we give a sample program called \id{kinFerTron\_dns.c}.
It uses the {\kinsol} dense linear solver module {\sunlinsoldense} 
and the {\nvecs} module (which provides a serial implementation of {\nvector})
for the solution of the Ferraris-Tronconi test problem~\cite{FlPa:99}.

This problem involves a blend of trigonometric and exponential terms:
\begin{equation}
\begin{split}
  & 0 = 0.5 \sin(x_1 x_2) - 0.25 x_2/\pi - 0.5 x_1 \\
  & 0 = (1-0.25/\pi) ( e^{2 x_1} - e ) + e x_2 / \pi - 2 e x_1 \\
  &\text{subject to } \\
  &\qquad x_{1\min} = 0.25 \le x_1 \le 1 = x_{1\max} \\
  &\qquad x_{2\min} = 1.5 \le x_2 \le 2\pi = x_{2\max} \, .
\end{split}
\end{equation}

The bounds constraints on $x_1$ and $x_2$ are treated by introducing
four additional variables and using {\kinsol}'s optional constraints
feature to enforce non-positivity and non-negativity:
\begin{equation*}
\begin{split}
  &l_1 = x_1 - x_{1\min} \ge 0\\
  &L_1 = x_1 - x_{1\max} \le 0\\
  &l_2 = x_2 - x_{2\min} \ge 0\\
  &L_2 = x_2 - x_{2\max} \le 0 \, .
\end{split}
\end{equation*}

The Ferraris-Tronconi problem has two known solutions. We solve it with
{\kinsol} using two sets of initial guesses for $x_1$ and $x_2$ (first their 
lower bounds and secondly the middle of their feasible regions), both with
an exact and modified Newton method, with and without line search.

Following the initial comment block, this program has a number
of \id{\#include} lines, which allow access to useful items in {\cvode}
header files. 
%%
The \id{kinsol.h} file provides prototypes for the {\kinsol}
functions to be called (excluding the linear solver selection
function), and also a number of constants that are to be used in
setting input arguments and testing the return value of \id{KINSol}.
%%
The \id{nvector\_serial.h} file is the header file for the serial
implementation of the {\nvector} module and includes definitions of the 
\id{N\_Vector} type, a macro to access vector components, and prototypes 
for the serial implementation specific machine environment memory allocation
and freeing functions.
%%
The \id{sunmatrix\_dense.h} file provides the prototype for the \id{SUNDenseMatrix} function.
%%
The \id{sunlinsol\_dense.h} file provides the prototype for the \id{SUNLinSol\_Dense} function.
%%
The \id{sundials\_types.h} file provides the definition of the
type \id{sunrealtype} (see \ugref{s:types} for details).  
For now, it suffices to read \id{sunrealtype} as \id{double}.
%%
Finally, \id{sundials\_math.h} is included for the definition of the
exponential function \id{RExp}.

Next, the program defines some problem-specific constants, which are
isolated to this early location to make it easy to change them as needed.  
%%
This program includes a user-defined accessor macro, \id{Ith}, that
is useful in writing the problem functions in a form closely matching 
the mathematical description of the system, i.e. with components numbered 
from 1 instead of from 0. The \id{Ith} macro is used to access components 
of a vector of type \id{N\_Vector} with a serial implementation.  
It is defined using the {\nvecs} accessor macro \id{NV\_Ith\_S} which 
numbers components starting with 0.
%%
The program prologue ends with prototypes of the user-supplied system function 
\id{func} and several private helper functions.

The \id{main} program begins with some dimensions and type declarations,
including use of the type \id{N\_Vector}, initializations, and allocation
and definitions for the user data structure \id{data} which contains two
arrays with lower and upper bounds for $x_1$ and $x_2$. Next, we create five
serial vectors of type \id{N\_Vector} for the two different initial guesses,
the solution vector \id{u}, the scaling factors, and the constraint specifications.

The initial guess vectors \id{u1} and \id{u2} are set by the private
functions \id{SetInitialGuess1} and \id{SetInitialGuess2} and the constraint
vector \id{c} is initialized to $[0,0,1,-1,1,-1]$ indicating that there are
no additional constraints on the first two components of \id{u} (i.e. $x_1$ and 
$x_2$) and that the 3rd and 5th components should be non-negative, while
the 4th and 6th should be non-positive.

The calls to \id{N\_VNew\_Serial}, and also later calls to various \id{KIN***}
functions, make use of a private function, \id{check\_flag}, which examines
the return value and prints a message if there was a failure.  The
\id{check\_flag} function was written to be used for any serial {\sundials}
application.

The call to \id{KINCreate} creates the {\kinsol} solver memory block.
Its return value is a pointer to that memory block for this
problem.  In the case of failure, the return value is \id{NULL}.  This
pointer must be passed in the remaining calls to {\kinsol} functions.

The next four calls to {\kinsol} optional input functions specify the 
pointer to the user data structure (to be passed to all subsequent calls
to \id{func}), the vector of additional constraints, and the function and
scaled step tolerances, \id{fnormtol} and \id{scsteptol}, respectively.

Solver memory is allocated through the call to \id{KINInit} which
specifies the system function \id{func} and provides the vector \id{u}
which will be used internally as a template for cloning additional necessary
vectors of the same type as \id{u}. 
%%
The use of the dense linear solver is specified by calling
\id{SUNDenseMatrix} to create the template Jacobian matrix (which also
specifies the problem size \id{NEQ}), then calling
\id{SUNLinSol\_Dense} to create the dense-direct linear solver, and
finally calling \id{KINSetLinearSolver} to attach these to {\kinsol}.

The main program proceeds by solving the nonlinear system eight times, using 
each of the two initial guesses \id{u1} and \id{u2} (which are first copied
into the vector \id{u} using \id{N\_VScale\_Serial} from
the {\nvecs} module), with and without globalization through line search
(specified by setting \id{glstr} to \id{KIN\_LINESEARCH} and \id{KIN\_NONE},
respectively), and applying either an exact or a modified Newton method.
The switch from exact to modified Newton is done by changing the number of 
nonlinear iterations after which a Jacobian evaluation is enforced, a value
\id{mset}$=1$ thus resulting in re-evaluating the Jacobian at every single
iteration of the nonlinear solver (exact Newton method). Note that passing
\id{mset}$=0$ indicates using the default {\kinsol} value of 10.

The actual problem solution is carried out in the private function \id{SolveIt}
which calls the main solver function \id{KINSol} after first setting the optional
input \id{mset}. After a successful return from \id{KINSol}, the solution
$[x_1, x_2]$ and some solver statistics are printed. 

The function \id{func} is a straightforward expression of the extended nonlinear 
system. It uses the macro \id{NV\_DATA\_S} (defined by the {\nvecs} module)
to extract the pointers to the data arrays of the \id{N\_Vector}s \id{u} and \id{f}
and sets the components of \id{fdata} using the current values for the components
of \id{udata}.
See \ugref{ss:sysFn} for a detailed specification of \id{f}.

The output generated by \id{kinFerTron\_dns} is shown below.

%%
\includeOutput{kinFerTron\_dns}{../../examples/kinsol/serial/kinFerTron_dns.out}
%%



\subsection{A serial Krylov example: kinFoodWeb\_kry}\label{ss:kinFoodWeb_kry}

We give here an example that illustrates the use of {\kinsol} with the Krylov
method {\spgmr}, via the {\sunlinsolspgmr} module, as the linear system solver.

This program solves a nonlinear system that arises from a discretized system of partial
differential equations. The PDE system is a six-species food web population
model, with predator-prey interaction and diffusion on the unit square in
two dimensions. Given the dependent variable vector of species concentrations
$c = [c_1, c_2,..., c_{n_s}]^T$, where $n_s = 2 n_p$ is the number of species 
and $n_p$ is the number of predators and of prey, then
the PDEs can be written as
\begin{equation}\label{e:kinFoodWeb_kry_pde}
  d_i \cdot \left( \frac{\partial^2 c_i}{\partial x^2} + 
    \frac{\partial^2 c_i}{\partial y^2} \right) + f_i(x,y,c) = 0
  \quad (i=1,...,n_s) \, ,
\end{equation}
where the subscripts $i$ are used to distinguish the species, and where
\begin{equation}\label{e:kinFoodWeb_kry_fterm}
f_i(x,y,c) = c_i \cdot \left(b_i + \sum_{j=1}^{n_s} a_{i,j} \cdot c_j \right) \, .
\end{equation}
The problem coefficients are given by
\begin{equation*}
  a_{ij} = 
  \begin{cases}
    -1                 & i=j \\
    -0.5 \cdot 10^{-6} & i \leq n_p , ~ j > n_p  \\
    10^4               & i > n_p , ~ j \leq n_p  \\
    0                  & \mbox{all other } \, ,
  \end{cases}
\end{equation*}
%%
\begin{equation*}
  b_i = b_i(x,y) = 
  \begin{cases}
    1 + \alpha xy   & i \leq n_p  \\
    -1 - \alpha xy   & i > n_p \, ,
  \end{cases}
\end{equation*}
and
%%
\begin{equation*}
  d_i = 
  \begin{cases}
    1 & i \leq n_p  \\
    0.5 & i > n_p  \, .
  \end{cases}
\end{equation*}
The spatial domain is the unit square $(x,y) \in [0,1] \times [0,1]$.

Homogeneous Neumann boundary conditions are imposed and the initial
guess is constant in both $x$ and $y$. For this example, the equations
(\ref{e:kinFoodWeb_kry_pde}) are discretized spatially with standard central finite
differences on a $8 \times 8$ mesh with $n_s = 6$, giving a system of size $384$.

Among the initial \id{\#include} lines in this case are lines to
include \id{sunlinsol\_spgmr.h} and \id{sundials\_math.h}.  The first contains
constants and function prototypes associated with the {\spgmr} solver module.
The inclusion of \id{sundials\_math.h} is done to access the \id{SUNMAX} and
\id{SUNRabs} macros, and the \id{SUNRsqrt} function to compute the square root
of a \id{sunrealtype} number.

The \id{main} program calls \id{KINCreate} and then calls \id{KINInit} with the
name of the user-supplied system function \id{func} and solution vector as
arguments.  The \id{main} program then calls a number of \id{KINSet*}
routines to notify {\kinsol} of the user data pointer, the
positivity constraints on the solution, and convergence tolerances on
the system function and step size.
It calls \id{SUNLinSol\_SPGMR} (see \ugref{sss:lin_solv_init}) to
create the {\spgmr} linear solver module, supplying the \id{maxl}
value of $15$ as the maximum Krylov subspace dimension.  It then calls
\id{KINSetLinearSolver} to attach this solver module to {\kinsol}.
Next, a maximum value of \id{maxlrst} $=2$ restarts is imposed through
a call to \id{SUNLinSol\_SPGMRSetMaxRestarts}.  Finally, the
user-supplied preconditioner setup and solve functions,
\id{PrecSetupBD} and \id{PrecSolveBD}, are specified through a call to
\id{KINSetPreconditioner} (see \ugref{ss:optional_input}).
The \id{data} pointer passed to \id{KINSetUserData} is passed to \id{PrecSetupBD} 
and \id{PrecSolveBD} whenever these are called. 

Next, \id{KINSol} is called, the return value is tested for error conditions, and
the approximate solution vector is printed via a call to \id{PrintOutput}.
After that, \id{PrintFinalStats} is called to get and print final statistics, and
memory is freed by calls to \id{N\_VDestroy\_Serial}, \id{FreeUserData} and \id{KINFree}.
The statistics printed are the total numbers of nonlinear iterations (\id{nni}),
of \id{func} evaluations (excluding those for $Jv$ product evaluations) (\id{nfe}),
of \id{func} evaluations for $Jv$ evaluations (\id{nfeSG}), of linear (Krylov)
iterations (\id{nli}), of preconditioner evaluations (\id{npe}), and of
preconditioner solves (\id{nps}). All of these optional outputs and others are
described in \ugref{ss:optional_output}.

Mathematically, the dependent variable has three dimensions: species
number, $x$ mesh point, and $y$ mesh point.  But in {\nvecs}, a vector of
type \id{N\_Vector} works with a one-dimensional contiguous array of
data components. The macro \id{IJ\_Vptr} isolates the translation from
three dimensions to one. Its use results in clearer code and makes it
easy to change the underlying layout of the three-dimensional data. 
Here the problem size is $384$, so we use the \id{NV\_DATA\_S} macro
for efficient \id{N\_Vector} access. The \id{NV\_DATA\_S} macro gives
a pointer to the first component of a serial \id{N\_Vector} which is then
passed to the \id{IJ\_Vptr} macro.

The preconditioner used here is the block-diagonal part of the true Newton
matrix and is based only on the partial derivatives of the interaction terms $f$
in (\ref{e:kinFoodWeb_kry_fterm}) and hence its  diagonal blocks are $n_s \times n_s$ matrices
($n_s = 6$).
It is generated and factored in the \id{PrecSetupBD} routine and
backsolved in the \id{PrecSolveBD} routine.  
See \ugref{ss:precondFn} for detailed descriptions
of these preconditioner functions.

The program \id{kinFoodWeb\_kry.c} uses the ``small'' dense functions for all operations 
on the $6 \times 6$ preconditioner blocks.  
Thus it includes \id{sundials\_dense.h}, and calls the small dense matrix
functions \id{newDenseMat}, \id{denseGETRF}, and \id{denseGETRS}.
The small dense functions are generally available for {\kinsol} user programs
(for more information, see the comments in the header file \id{sundials\_dense.h}).

In addition to the functions called by {\kinsol}, \id{kinFoodWeb\_kry.c} includes
definitions of several private functions.  These are: \id{AllocUserData}
to allocate space for the preconditioner and the pivot arrays; \id{InitUserData}
to load problem constants in the \id{data} block; \id{FreeUserData} to free
that block; \id{SetInitialProfiles} to load the initial values in \id{cc};
\id{PrintHeader} to print the heading for the output;
\id{PrintOutput} to retrieve and print selected solution values;
\id{PrintFinalStats} to print statistics; and \id{check\_flag}
to check return values for error conditions.

The output generated by \id{kinFoodWeb\_kry} is shown below.  Note that the
solution involved 10 Newton iterations, with an average of about 38
Krylov iterations per Newton iteration.

\includeOutput{kinFoodWeb\_kry}{../../examples/kinsol/serial/kinFoodWeb_kry.out}

%-----------------------------------------------------------------------------------

\subsection{A parallel example: kinFoodWeb\_kry\_bbd\_p}\label{ss:kinFoodWeb_kry_bbd_p}

In this example, \id{kinFoodWeb\_kry\_bbd\_p}, we solve the same
problem as with \id{kinFoodWeb\_kry} above, but in parallel, and
instead of supplying the preconditioner we use the {\kinbbdpre} module.  

In this case, we think of the parallel {\mpi} processes as
being laid out in a rectangle, and each process being assigned a
subgrid of size \id{MXSUB}$\times$\id{MYSUB} of the $x-y$ grid. If
there are \id{NPEX} processes in the $x$ direction and \id{NPEY}
processes in the $y$ direction, then the overall grid size is
\id{MX}$\times$\id{MY} with \id{MX}$=$\id{NPEX}$\times$\id{MXSUB} and
\id{MY}$=$\id{NPEY}$\times$\id{MYSUB}, and the size of the nonlinear system is
\id{NUM\_SPECIES}$\cdot$\id{MX}$\cdot$\id{MY}.  

The evaluation of the nonlinear system function is performed in \id{func}.
In this parallel setting, the processes first communicate the subgrid
boundary data and then compute the local components of the nonlinear system
function. The {\mpi} communication is isolated in the private function \id{ccomm}
(which in turn calls \id{BRecvPost}, \id{BSend}, and \id{BRecvWait}) and the 
subgrid boundary data received from neighboring processes is loaded into the
work array \id{cext}. The computation of the nonlinear system function is done
in \id{func\_local} which starts by copying the local segment of the \id{cc}
vector into \id{cext}, and then by imposing the boundary conditions by copying the
first interior mesh line from \id{cc} into \id{cext}. After this, the nonlinear
system function is evaluated by using central finite-difference approximations
using the data in \id{cext} exclusively.

{\kinbbdpre} uses a band-block-diagonal preconditioner, generated by difference
quotients.  The upper and lower half-bandwidths of the Jacobian block generated
on each process are both equal to $2 \cdot n_s - 1$, and that is the value
passed as \id{mudq} and \id{mldq} in the call to \id{KINBBDPrecInit}.
These values are much less than the true half-bandwidths of the Jacobian blocks,
which are $n_s \cdot$ \id{MXSUB}.  However, an even narrower band matrix
is retained as the preconditioner, with half-bandwidths equal to $n_s$, and this
is the value passed to \id{KINBBDPrecInit} for \id{mukeep} and \id{mlkeep}.

The function \id{func\_local} is also passed as the \id{gloc} argument to 
\id{KINBBDPrecInit}. Since all communication needed for the evaluation of the
local approximation of $f$ used in building the band-block-diagonal preconditioner
is already done for the evaluation of $f$ in \id{func}, a \id{NULL} pointer is
passed as the \id{gcomm} argument to \id{KINBBDPrecInit}.

The \id{main} program resembles closely that of the \id{kinFoodWeb\_kry} example, with
particularization arising from the use of the parallel {\mpi} {\nvecp} module.
It begins by initializing {\mpi} and obtaining the total number of processes and 
the rank of the local process. The local length of the solution vector is then 
computed as \id{NUM\_SPECIES}$\cdot$\id{MXSUB}$\cdot$\id{MYSUB}.
Distributed vectors are created by calling the constructor defined in {\nvecp}
with the {\mpi} communicator and the local and global problem sizes as arguments.
All output is performed only from the process with id equal to $0$.
Finally, after \id{KINSol} is called and the results are printed, all memory
is deallocated, and the {\mpi} environment is terminated by calling \id{MPI\_Finalize}.

The output generated by \id{kinFoodWeb\_kry\_bbd\_p} is shown below.  Note that 9 Newton
iterations were required, with an average of about 51.6 Krylov iterations
per Newton iteration.

\includeOutput{kinFoodWeb\_kry\_bbd\_p}{../../examples/kinsol/parallel/kinFoodWeb_kry_bbd_p.out}


